In this tutorial we will go through some features of the remora
R package that allows users to conduct a number of analyses when working
with animal tracking or occurrence datasets. Here we won’t cover all the
features of the package, but will go through one of the workflows that
may be useful for some of the projects you are working on this week.
remora R packageThe remora R package was developed by a
multi-institutional team led by Fabrice Jaine from IMOS to allow users
of the IMOS Animal Tracking Facility (and the wider animal tracking
community) to be able to quickly and easily conduct quality control
checks on their data, and allow easy access to other remote sensing and
in situ data collected and managed by IMOS.
remora is an R package enabling the
integration of animal acoustic telemetry data with oceanographic
observations collected by ocean observing programs. It includes
functions for:
Whilst the functions in remora were primarily developed to work with acoustic telemetry data, the environmental data extraction and integration functionalities will work with other spatio-temporal ecological datasets (eg. satellite telemetry, species sightings records, fisheries catch records).
In this tutorial we will work specifically with the
extractEnv() function.
remora and other useful packages for this
tutorialLets Install some key packages that we will use for these to manipulate, analyse and visualise data. The code below also installs all the dependency packages, so may take a while. If you are using RStudio, you will only need to run this once if you dont already have these packages installed on your system.
## Installing standard R packages
install.packages(c("tidyverse",
"sf",
"mapview",
"terra",
"scales",
"remotes"),
dependencies = TRUE)
## We will now use the `remotes` package to install `remora` from the Github page
remotes::install_github('IMOS-AnimalTracking/remora', build_vignettes = TRUE, dependencies = TRUE)
remora to append IMOS environmental data to
occurrence dataThe imos_variables() function will help the user
identify currently available environmental layers that can be accessed
and associated. Variables include spatial layers including
bathy and dist_to_land, which provide distance
measures of bathymetry (in meters) and proximity to land (in
kilometers). Variable names that start with rs_ are
remotely-sensed or gridded environmental layers and variables starting
with moor_ include in situ sub-surface mooring
environmental layers. A new feature of remora is additional
capacity to access gridded environmental layers from the Bluelink
dataset (BRAN). These variable names start with ocean_ and
provides daily modeled oceanographic variables at different depths from
the surface to 4,509 m in depth.
library(remora)
imos_variables()
| Variable | Platform | Temporal resolution | Units | Function to use | Description | Source |
|---|---|---|---|---|---|---|
| bathy | Composite raster product |
|
meters | extractEnv() | Australian Bathymetry and Topography Grid. 250 m resolution. | Geosciences Australia |
| dist_to_land | Raster product |
|
kilometers | extractEnv() | Distance from nearest shoreline (in km). Derived from the high-resolution Open Street Map shoreline product. | This package |
| rs_sst | Satellite-derived raster product | daily (2002-07-04 - present) | degrees Celcius | extractEnv() | 1-day multi-swath multi-sensor (L3S) remotely sensed sea surface temperature (degrees Celcius) at 2 km resolution. Derived from the Group for High Resolution Sea Surface Temperature (GHRSST) | IMOS |
| rs_sst_interpolated | Raster product | daily (2006-06-12 - present) | degrees Celcius | extractEnv() | 1-day interpolated remotely sensed sea surface temperature (degrees Celcius) at 9 km resolution. Derived from the Regional Australian Multi-Sensor Sea surface temperature Analysis (RAMSSA, Beggs et al. 2010) system as part of the BLUElink Ocean Forecasting Australia project | IMOS |
| rs_chl | Satellite-derived raster product | daily (2002-07-04 - present) | mg.m-3 | extractEnv() | Remotely sensed chlorophyll-a concentration (OC3 model). Derived from the MODIS Aqua satellite mission. Multi-spectral measurements are used to infer the concentration of chlorophyll-a, most typically due to phytoplankton, present in the water (mg.m-3). | IMOS |
| rs_current | Composite raster product | daily (1993-01-01 - present) | ms-1; degrees | extractEnv() | Gridded (adjusted) sea level anomaly (GSLA), surface geostrophic velocity in the east-west (UCUR) and north-south (VCUR) directions for the Australasian region derived from the IMOS Ocean Current project. Two additional variables are calculated: surface current velocity (ms-1) and bearing (degrees). | IMOS |
| rs_salinity | Satellite-derived raster product | weekly (2011-08-25 - 2015-06-07) | psu | extractEnv() | 7-day composite remotely sensed salinity. Derived from the NASA Aquarius satellite mission (psu). | IMOS |
| rs_turbidity | Satellite-derived raster product | daily (2002-07-04 - present) | m-1 | extractEnv() | Diffuse attenuation coefficient at 490 nm (K490) indicates the turbidity of the water column (m-1). The value of K490 represents the rate which light at 490 nm is attenuated with depth. For example a K490 of 0.1/meter means that light intensity will be reduced one natural log within 10 meters of water. Thus, for a K490 of 0.1, one attenuation length is 10 meters. Higher K490 value means smaller attenuation depth, and lower clarity of ocean water. | IMOS |
| rs_npp | Satellite-derived raster product | daily (2002-07-04 - present) | mgC.m_2.day-1 | extractEnv() | Net primary productivity (OC3 model and Eppley-VGPM algorithm). Modelled product used to compute an estimate of the Net Primary Productivity (NPP). The model used is based on the standard vertically generalised production model (VGPM). The VGPM is a “chlorophyll-based” model that estimates net primary production from chlorophyll using a temperature-dependent description of chlorophyll-specific photosynthetic efficiency. For the VGPM, net primary production is a function of chlorophyll, available light, and the photosynthetic efficiency. The only difference between the Standard VGPM and the Eppley-VGPM is the temperature-dependent description of photosynthetic efficiencies, with the Eppley approach using an exponential function to account for variation in photosynthetic efficiencies due to photoacclimation. | IMOS |
| moor_sea_temp | Fixed sub-surface moorings | hourly | degrees Celcius | extractMoor() | Depth-integrated in-situ, hourly time-series measurements of sea temperature (degrees Celcius) at fixed mooring locations | IMOS |
| moor_psal | Fixed sub-surface moorings | hourly | psu | extractMoor() | Depth-integrated in-situ, hourly time-series measurements of salinity (psu) at fixed mooring locations | IMOS |
| moor_ucur | Fixed sub-surface moorings | hourly | ms-1 | extractMoor() | Depth-integrated in-situ, hourly time-series measurements of subsurface geostrophic current velocity in the east-west direction (ms-1) at fixed mooring locations | IMOS |
| moor_vcur | Fixed sub-surface moorings | hourly | ms-1 | extractMoor() | Depth-integrated in-situ, hourly time-series measurements of subsurface geostrophic current velocity in the north-south direction (ms-1) at fixed mooring locations | IMOS |
| ocean_temp | 3D Raster product | daily (1993-01-01 - present) | degrees Celcius | extractBlue() | Water temperature at specified depth from the surface to 4,509-m depth | Bluelink (CSIRO) |
| ocean_salt | 3D Raster product | daily (1993-01-01 - present) | psu | extractBlue() | Water salinity at specified depth from the surface to 4,509-m depth | Bluelink (CSIRO) |
| ocean_u | 3D Raster product | daily (1993-01-01 - present) | ms-1 | extractBlue() | Current horizontal speed component in the x direction at specified depth from the surface to 4,509-m depth | Bluelink (CSIRO) |
| ocean_v | 3D Raster product | daily (1993-01-01 - present) | ms-1 | extractBlue() | Current horizontal speed component in the y direction at specified depth from the surface to 4,509-m depth | Bluelink (CSIRO) |
| ocean_w | 3D Raster product | daily (1993-01-01 - present) | ms-1 | extractBlue() | Current vertical speed component at specified depth from the surface to 4,509-m depth | Bluelink (CSIRO) |
| ocean_eta_t | 3D Raster product | daily (1993-01-01 - present) | meters | extractBlue() | Sea surface height at the water surface | Bluelink (CSIRO) |
| ocean_mld | 3D Raster product | daily (1993-01-01 - present) | meters | extractBlue() | Mixed layer depth in relation to the water surface | Bluelink (CSIRO) |
| air_wind | Raster product | daily (1993-01-01 - present) | ms-1; degrees clockwise | extractBlue() | Returns both wind speed (ms-1) and direction (° clockwise) as two separate columns | Bluelink (CSIRO) |
extractEnv() functionIn this first example, we will extract the variable Interpolated sea surface temperature and append it to the ‘fish_occurrence’ dataset.
Environmental variable can only be accessed one at a time using the
extractEnv() function. There are a few parameters within
the function that can help the user identify the variable required, and
to manage the downloaded environmental layers:
imos_variables() for available
variables and variable names)imos.cache
within your working directory.TRUE it can be time and memory consuming for long
projectsimos.cache where downloaded rasters should be saved. If
none provided automatic folder names generated based on study
extentThe extractEnv() function was designed to take in
minimal input data (i.e., latitude, longitude, date) to be able to
extract and append user defined environmental data stored and maintained
on the AODN THREDDS server. In most cases, delayed-mode datasets are
prefered, but if DM data is not available for the time period, Near-Real
Time data can be accessed instead. Although a lot of the other functions
in this package were created specifically to handle Acoustic Telemetry
datasets, this function was generalised so it can be used for other
datasets that require temporally and spatially variable environmental
data exraction.
Lets start with a simple example using a dataset of occurrences of three species of fish along the East coast of Australia. This dataset was collated from the Atlas of Living Australia. The dataset consists of occurrence records of Common Coral Trout (Plectropomus leopardus), Snapper (Chrysophrys auratus) and Spangled Emperor (Lethrinus nebulosus) between 2011 and 2024.
Each occurrence point here is associated with a source of the occurrence record, geographic coordinates (longitude and latitude) and a date that the occurrence was recorded. Before we start extracting IMOS data for these records, lets have a look at it closer.
library(tidyverse)
library(sf)
library(mapview)
## Species Occurrence dataset
fish_occurrence <- read_csv("data/Fish_occurrence.csv", show_col_types = FALSE)
## Data to plot land
land <- map_data("world")
## Using ggplot to visualise the data
fish_occurrence %>%
ggplot(aes(x = longitude, y = latitude, col = common_name)) +
geom_polygon(data = land, aes(x = long, y = lat, group = group),
fill = "grey", inherit.aes = F) +
geom_point() +
coord_equal(xlim = c(142, 160), ylim = c(-43, -9)) +
labs(x = NULL, y = NULL, col = "Species") +
theme_bw()
## Lets use the `sf` and `mapview` to interactively plot these data
fish_occurrence %>%
st_as_sf(coords = c("longitude", "latitude"), crs = 4326) %>%
mapview(zcol = "common_name")
Using the example fish_occurrence data, lets extract sea
surface temperature data to explore the thermal niches of multiple fish
species:
## This can take a lot of time (over 40 mins sometimes!),
## so lets just run a small subset during the tutorial!
fish_env <-
fish_occurrence %>%
# slice(5:7) %>%
extractEnv(X = "longitude",
Y = "latitude",
datetime = "date",
env_var = "rs_sst_interpolated",
cache_layers = F,
fill_gaps = T,
full_timeperiod = F,
.parallel = F)
Since we have limited time, and cant let the function run for the full dataset, here is what it looks like if you let it run and append Interpolated SST data for the full dataset:
## Lets load up the dataset I prepared earlier!
fish_sst <- read_csv("data/fish_sst.csv", show_col_types = FALSE)
## Lets map the data interactively
fish_sst %>%
st_as_sf(coords = c("longitude", "latitude"), crs = 4326) %>%
mapview(zcol = "rs_sst_interpolated")
## Splitting the data by species
fish_sst %>%
ggplot(aes(x = longitude, y = latitude, color = rs_sst_interpolated)) +
geom_polygon(data = land, aes(x = long, y = lat, group = group),
fill = "grey", inherit.aes = F) +
geom_point() +
coord_equal(xlim = c(142, 160), ylim = c(-43, -9)) +
scale_color_viridis_c() +
scale_fill_viridis_c(option = "B") +
labs(x = NULL, y = NULL, col = "SST Interpolated (˚C)") +
facet_wrap(~common_name, nrow = 1) +
theme_bw()
Apart from the spatial plotting of the data, we can try and look at how the thermal niche of these species differ:
library(patchwork)
a <-
fish_sst %>%
ggplot(aes(x = rs_sst_interpolated, y = fct_rev(common_name), fill = common_name)) +
geom_boxplot(show.legend = FALSE, alpha = 0.5) +
scale_fill_viridis_d() +
theme_void()
b <-
fish_sst %>%
ggplot(aes(x = rs_sst_interpolated, fill = common_name)) +
geom_histogram(alpha = 0.5, position = "identity", bins = 40) +
scale_fill_viridis_d() +
labs(x = "SST Interpolated (˚C)", y = "Count", fill = "Common Name") +
theme_bw() + theme(legend.position = "inside", legend.position.inside = c(0.15, 0.8))
(a/b) + plot_layout(ncol = 1, heights = c(0.1, 1))
In the second example, we will use the extractEnv()
function to append current data to animal tracking datasets. Here we
will use data from GPS CTD tags deployed on Flatback Turtles
(Natator depressus) from a key rookery on Gurriba Island in the
Crocodile Islands Group (Milingimbi) in the Northern Territory. Female
turtles were tagged once they finished nesting on the island, and
conducted their post-nesting migratinos to key foraging grounds across
Northern Australia. The tags collected profile data on temperature and
salinity as the turtles dived, and transmitted data via FastLock
GPS.
For this example, lets use a subset of this data and append ocean current data to the tracks to understand if these turtles passively drift with the current or actively swim against them:
turtle_tracks <- read_csv("Data/Turtle_tracks.csv", show_col_types = FALSE)
Lets explore the track data by plotting it
## quick interactive plot
turtle_tracks %>%
st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
mapview(zcol = "id")
## static plot
turtle_tracks %>%
ggplot(aes(x = lon, y = lat, color = id, group = id)) +
geom_polygon(data = land, aes(x = long, y = lat, group = group), fill = "grey", inherit.aes = F) +
geom_point(cex = 0.5) +
geom_path() +
coord_equal(xlim = c(120, 145), ylim = c(-19, -8)) +
scale_color_viridis_d() +
labs(x = NULL, y = NULL, col = "Turtle Name") +
theme_bw()
We can use the extractEnv() function to append current
data to the tracking data. In the code below we will also cache layers
to your system, so you can access the gridded datasets locally for
plotting or further analysis. We will also ask the function to look for
Near-Real Time data if Delayed-mode data is unavailable:
## Again, this will take a while so I suggest not running this during the tutorial
turtle_curr <-
turtle_tracks %>%
# slice(1:6) %>%
extractEnv(
X = "lon",
Y = "lat",
datetime = "date",
env_var = "rs_current",
cache_layers = T,
folder_name = "turtle",
fill_gaps = T,
full_timeperiod = F,
.parallel = F,
nrt = TRUE,
output_format = ".tif"
)
The extraction of current data here will append 3 datasets to each position:
In addition to this, the function also uses these variables to calculate current velocity (‘rs_current_velocity’) and current bearing (‘rs_current_bearing’)
## Lets load up the data I ran earlier
turtle_curr <- read_csv("data/turtle_curr.csv", show_col_types = FALSE)
Now that we have extracted the current data, lets visualise the tracking data with the downloaded GSLA data:
library(terra)
library(scales)
gsla <- rast("data/rs_gsla.tif")
plot(gsla)
## Plot the gridded data with the tracks
# convert the gridded data into a dataframe to plot
gsla_df <-
as.data.frame(gsla, xy = TRUE) %>%
pivot_longer(col = -c(1:2), names_to = "rs_date", values_to = "gsla")
turtle_curr %>%
ggplot(aes(x = lon, y = lat, color = id, group = id)) +
geom_raster(data = gsla_df, aes(x = x, y = y, fill = gsla, group = rs_date), inherit.aes = F) +
geom_polygon(data = land, aes(x = long, y = lat, group = group), fill = "grey", inherit.aes = F) +
geom_point(cex = 0.5) +
geom_path() +
coord_equal(xlim = c(120, 145), ylim = c(-19, -8)) +
scale_color_viridis_d() +
facet_wrap(~rs_date) +
labs(x = NULL, y = NULL, col = "Turtle Name") +
theme_bw()
## Plotting this interactively
turtle_curr %>%
st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
mapview(zcol = "rs_current_velocity", layer.name = "Current Velocity") +
mapview(gsla[[1]], layer.name = "GSLA")
We can also look at how the direction and velocity of the surface currents affected the turtles along their migration path:
## We first need to convert the current bearing into an angular degree
## and radius for the forthcoming ggplot geom_spoke() plot
turtle_curr <-
turtle_curr %>%
mutate(degree_angle = (450 - rs_current_bearing) %% 360,
radius = degree_angle*pi/180)
## Plotting the current direction and velocity for each position over time
turtle_curr %>%
ggplot(aes(x = date, y = id, color = rs_current_velocity)) +
geom_point() +
geom_spoke(aes(x = date, y = id, angle = degree_angle, radius = radius),
arrow = arrow(length = unit(0.05, 'inches')), lwd = 0.5) +
labs(x = NULL, y = NULL, col = "Current Velocity (m/s)") +
scale_x_date(date_breaks = "1 month", date_labels = "%b") +
scale_color_viridis_c() +
theme_bw()
## Mapping the current direction and velocity along each turtle track
turtle_curr %>%
ggplot(aes(x = lon, y = lat, color = rs_current_velocity, group = id,
angle = degree_angle, radius = rescale(radius, 1, 1.5))) +
geom_polygon(data = land, aes(x = long, y = lat, group = group),
fill = "grey", inherit.aes = F) +
geom_point() +
geom_path() +
geom_spoke(arrow = arrow(length = unit(0.05, 'inches')), lwd = 0.5) +
coord_equal(xlim = c(120, 145), ylim = c(-19, -8)) +
scale_color_viridis_c() +
scale_fill_viridis_c(option = "B") +
labs(x = NULL, y = NULL, col = "Current Velocity (m/s)") +
facet_wrap(~id, ncol = 1) +
theme_bw()